Multics Technical Bulletin MTB-194


To: Distribution
From: Paul Green
Date: 05/08/75


Subject: A random word generator for Multics


Morrie Gasser of the MITRE Corporation has written a set of
programs that are capable of generating pronounceable English
words at random. Enclosed with this MTB is the draft
documentation for the various modules which comprise the word
generator. Comments on the user interface are especially
welcome; send them to Green.Druid and Gasser.Druid on the MIT
Multics system.


The random word generator (random_word_) is a table-driven
program that returns an array of numbers (units) which form a
word. The units are supplied by a subroutine that is
caller-specified. The standard version of this subroutine is
named random_unit_, although there is no requirement that the
units themselves be random.


The parameters to random_word_ are the number of letters
that may appear in the generated word, and the random_unit_
subroutine. The random_word_ routine calls random_unit_
repeatedly to get units, each time determining from a "digram
table" whether the returned unit may be added to the end of the
word being generated, according to the rules encoded in the
digram table. Units which satisfy the rules are added to the end
of the generated word; units which do not satisfy the rules are
ignored. Units are requested until the length in letters meets
the caller's criteria.
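
The accept-or-reject loop described above can be sketched in
Python. This is an illustration only, not the Multics
implementation: the four-unit table LETTERS, the pair_allowed rule
(a unit may not follow itself), and the max_tries guard are all
invented for the example, and the real digram rules are
considerably richer.

```python
import random

# Hypothetical unit table: index -> letters. Not the real "digrams_" table.
LETTERS = ["a", "b", "sh", "e"]


def pair_allowed(previous_unit, next_unit):
    """Toy stand-in for a digram-table lookup: a unit may not follow itself."""
    return previous_unit != next_unit


def generate_units(target_letters, random_unit, max_tries=10_000):
    """Request units until the word is exactly target_letters letters long."""
    word = []    # accepted unit indices, in order
    length = 0   # length of the word so far, counted in letters
    tries = 0
    while length < target_letters:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("gave up: no acceptable unit was supplied")
        unit = random_unit()
        too_long = length + len(LETTERS[unit]) > target_letters
        bad_pair = bool(word) and not pair_allowed(word[-1], unit)
        if too_long or bad_pair:
            continue  # a unit that breaks a rule is simply ignored
        word.append(unit)
        length += len(LETTERS[unit])
    return word


rng = random.Random(1975)
units = generate_units(6, lambda: rng.randrange(len(LETTERS)))
print("".join(LETTERS[u] for u in units))
```

Note that the length test counts letters, not units, so a
two-letter unit such as "sh" is refused when only one letter of
room remains.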


The table that drives random_word_ is referenced as an
external array with the name "digrams_". This table can be
prepared by the user by creating an ASCII segment specifying the
rules, and compiling it with the digram_table_compiler. The
digram table is in two parts. The first part specifies one or
two letter symbols that define each unit, and some flags that
define various rules for each unit. The second part lists every
possible pair of these units (i.e., if there are n units then
there are n*n pairs), and contains several more flags for each
pair that define rules about combining pairs.


Multics Project internal working documentation. Not to be
reproduced or distributed outside the Multics Project.


Only the digram table itself is specifically English-oriented;
the symbolic representation of the units and letters is
unimportant to the digram_table_compiler and random_word_ (except
that the number of letters in each unit is used to determine how
long the generated word is). The random_word_ and random_unit_
subroutines operate upon unit indices, not the actual ASCII
characters. These unit indices may be converted back to their
character representations by calling the convert_word_ subroutine.


As the word generator currently exists, the random_unit_
subroutine "knows" what units exist in the digram table, what
their frequencies of occurrence are, and which ones have specific
attributes. Thus it does not have to reference the digram table.
For that reason, if it is desired to replace the digram table, the
random_unit_ subroutine must also be replaced. Some of these
dependencies could have been eliminated by having the
random_unit_ subroutine reference the digram table on the first
call to determine which units exist, but this was not done for
reasons of efficiency. The only unit attribute that random_unit_
cares about is the "vowel" attribute, for the entrypoint
random_unit_$random_vowel. For these reasons, a new digram table
can be created (without replacing random_unit_) only if the
English-letter representation of the units, and the order of the
units, is not modified.


Note that only the command interface (generate_words) will
be user-visible; the rest of the modules will remain internal
interfaces.



Name: generate_word_


This subroutine returns a random pronounceable word as an
ASCII character string. It also returns the same word split by
hyphens into syllables as an aid to pronunciation.


Usage


     declare generate_word_ entry (char(*), char(*), fixed bin,
          fixed bin);


     call generate_word_ (word, hyphenated_word, min, max);


1) word             is the random word, padded on the right with
                    blanks. This string must be long enough to
                    hold the word (at least as long as max).
                    (Output)

2) hyphenated_word  is the same word split into syllables. The
                    length of this string must be greater than
                    max to allow for the hyphens. A length of
                    3*max/2 + 1 will always be sufficient.
                    (Output)

3) min              is the minimum length of the word to be
                    generated. This value must be greater than 3
                    and less than 21. (Input)

4) max              is the maximum length of the word to be
                    generated. The actual length of the word
                    will be uniformly random between min and max.
                    The value of max must be greater than or
                    equal to min, and less than 21. (Input)


Note


Each call to generate_word_ should produce a different
random word, regardless of when the call is made. However, as
with any random generator, there is no guarantee that there will
be no duplicates. The probability of duplication is greater with
shorter words.



Entry: generate_word_$init_seed


This entry allows the user to specify a starting seed for
generating random words. If a seed is specified, the exact same
sequence of random words will always be generated on subsequent
calls to generate_word_ providing the same values of min and max
are specified. If this entry is not called in a process, the
value of the clock is used as the initial seed on the first call
to generate_word_, thereby "guaranteeing" different sequences of
words in different processes.
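
The seeding behavior can be illustrated with a small Python
sketch. It is an analogy, not the Multics code: Python's
random.Random stands in for the generator, the function names are
hypothetical, and only the choice of word length is modeled. A
zero seed is treated as a request to use the clock, as the
argument description for this entry states.

```python
import random
import time

_rng = None  # per-process generator state, created on first use


def init_seed(seed):
    """Start a reproducible sequence; a zero seed falls back to the clock."""
    global _rng
    _rng = random.Random(time.time_ns() if seed == 0 else seed)


def next_length(min_len, max_len):
    """Pick a word length uniformly between min_len and max_len inclusive."""
    if _rng is None:
        init_seed(0)  # no seed was given in this process: use the clock
    return _rng.randint(min_len, max_len)


init_seed(42)
first_run = [next_length(6, 8) for _ in range(10)]
init_seed(42)
second_run = [next_length(6, 8) for _ in range(10)]
print(first_run == second_run)
```

The two runs agree only because the same min and max are passed
each time, which mirrors the condition stated above.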


Usage

     declare generate_word_$init_seed entry (fixed bin(35));

     call generate_word_$init_seed (seed);


1) seed   is the initial seed value. If zero, the system
          clock will be used as the seed. (Input)



Name: generate_words, gw


This command will print random pronounceable "words" on the
user's terminal.


Usage


     generate_words -control_args-


1) control_args  may be selected from the following:


   nwords             is the number of words to print. If not
                      specified, one word is printed.

   -min n             specifies the minimum length, in characters,
                      of the words to be generated.

   -max n             specifies the maximum length of the words to
                      be generated.

   -length n, -ln n   specifies the length of the words to be
                      generated. If this argument is specified, all
                      words will be this length, and -min or -max
                      may not be specified.

   -hyphenate, -hph   causes the hyphenated form (divided into
                      syllables) of each word to be printed
                      alongside the original word.

   -seed SEED         On the first call to generate_words in a
                      process, the system clock is used to obtain a
                      starting "seed" for generating random words.
                      This seed is updated for every word generated,
                      and subsequent values of the seed depend on
                      previous values (in a rather complex way). If
                      the -seed argument is specified, SEED must be
                      a positive decimal integer. For a given value
                      of SEED, the sequence of random words will
                      always be the same providing the same length
                      values are specified. When no -seed argument
                      is specified, the last value of the updated
                      seed from the previous call to generate_words
                      will be used. To revert back to using the
                      system clock as the seed, specify a zero value
                      for SEED, i.e., -seed 0.

Notes


If neither -min, -max, nor -length are specified, the
defaults are -min 6 and -max 8. In all other cases, the defaults
are -min 4 and -max 20.
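
The defaulting rule can be written out as a short Python sketch.
The function name and keyword arguments are hypothetical; they only
model the three control arguments named above, with None meaning
"not given on the command line".

```python
def resolve_lengths(min_len=None, max_len=None, length=None):
    """Return the (min, max) pair implied by the control arguments."""
    if length is not None:
        # -length fixes every word at one length and excludes -min/-max.
        if min_len is not None or max_len is not None:
            raise ValueError("-length may not be combined with -min or -max")
        return length, length
    if min_len is None and max_len is None:
        return 6, 8  # nothing specified at all
    # Exactly one of -min/-max (or both) was given: fill in the other.
    low = 4 if min_len is None else min_len
    high = 20 if max_len is None else max_len
    return low, high


print(resolve_lengths())            # no arguments
print(resolve_lengths(min_len=10))  # only -min given
```

The sketch does not check that min is no greater than max; any
such validation in the real command is not described here.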


If -length is not specified, the lengths of the random words
will be uniformly distributed between min and max. Words
generated are printed one per line, with the hyphenated forms, if
specified, lined up in a column alongside the original words.



Name: convert_word_


This subroutine is used to convert the random word array
returned by random_word_ to ASCII.


Usage


     dcl convert_word_ entry ((0:*) fixed bin, (0:*) bit(1)
          aligned, fixed bin, char(*), char(*));


     call convert_word_ (word, hyphenated_word, word_length,
          ascii_word, ascii_hyphenated_word);


1) word                   Array of random units returned from a
                          previous call to random_word_. (Input)

2) hyphenated_word        Array of bits indicating where hyphens
                          are to be placed, returned from
                          random_word_. (Input)

3) word_length            Number of units in word, returned from
                          random_word_. (Input)

4) ascii_word             This string will contain the word, left
                          justified, with trailing blanks. This
                          string should be long enough to hold the
                          longest word that may be returned. This
                          is normally the value of "maximum"
                          supplied to random_word_. (Output)

5) ascii_hyphenated_word  This string will contain the word, with
                          hyphens between the syllables, left
                          justified within the string. The length
                          of this string should be at least
                          3*maximum/2+1 to guarantee that the
                          hyphenated word will fit. (Output)
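
The conversion can be sketched in Python as follows. This is a
hedged illustration: the LETTERS table is invented (index 0 is
left unused, in the spirit of the 0-origin arrays declared above),
and the sketch assumes that a hyphen bit on the final unit marks
the end of the last syllable and therefore produces no trailing
hyphen.

```python
# Hypothetical unit table; index 0 is unused, real tables come from "digrams_".
LETTERS = ["", "a", "b", "sh", "e"]


def convert_word(word, hyphens, word_length):
    """Turn unit indices and hyphen bits into (plain, hyphenated) strings.

    word and hyphens are 0-based Python lists holding the first
    word_length units and their syllable-end bits.
    """
    plain = []
    hyphenated = []
    for i in range(word_length):
        text = LETTERS[word[i]]
        plain.append(text)
        hyphenated.append(text)
        # A set bit ends a syllable; no hyphen is written after the last unit.
        if hyphens[i] and i < word_length - 1:
            hyphenated.append("-")
    return "".join(plain), "".join(hyphenated)


print(convert_word([2, 1, 3, 4, 2, 0],
                   [False, True, False, False, True, False], 5))
```

Unused trailing positions (the zero in the example) are never
examined because only the first word_length entries are read.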


Entry: convert_word_$no_hyphens


This entry can be used to obtain the ASCII form of a random
word without the hyphenated form.


Usage

     dcl convert_word_$no_hyphens entry ((0:*) fixed bin, fixed bin,
          char(*));


     call convert_word_$no_hyphens (word, word_length,
          ascii_word);


Arguments are the same as above.



Name: convert_word_char_


This subroutine facilitates printing of the hyphenated word
returned from a call to hyphenate_.


Usage


     dcl convert_word_char_ entry (char(*), (*) bit(1) aligned,
          fixed bin, char(*) varying);


     call convert_word_char_ (word, hyphens, last, result);


1) word      This string is the word to be hyphenated. (Input)

2) hyphens   This is the array returned from a call to
             hyphenate_ that marks characters in word after
             which hyphens are to be inserted. (Input)

3) last      This is the status code returned from hyphenate_.
             If negative, the result will be the original word,
             unhyphenated, with ** following it. If positive,
             the word will be returned hyphenated, but with an
             asterisk preceding the last'th character. If
             zero, the word will be returned hyphenated without
             any asterisks. (Input)

4) result    This string contains the resultant hyphenated
             word. (Output)



Name: digram_table_compiler, dtc


This command compiles a source segment containing the
digrams for the random word generator and produces an object
segment with the name "digrams_".


Usage

     digram_table_compiler pathname -option-


1) pathname   is the pathname of the source segment. If
              the suffix ".dtc" does not appear, it will be
              assumed. Regardless of the name of the
              source segment, the output segment will
              always be given the name "digrams_" and will
              be placed in the working directory.


2) -option-   may be the following:


   -list, -ls       lists the compiled table on the terminal.
                    The table will be printed in columns to fit
                    the terminal line length. If file_output is
                    being used, lines will be 132 characters
                    long.

   -list n, -ls n   lists the table as above, but uses n as the
                    number of columns to print. Each column
                    occupies 14 positions, thus a value of 5 will
                    cause 5 columns to be printed, each line
                    being 70 characters long. This option is
                    useful when file_output is being used, so
                    that the lines produced are not too long to
                    fit on the terminal to be used to print the
                    output file.


Notes


The compiler makes an attempt to detect inconsistent
combinations of attributes, as well as syntax errors. If an
error is encountered during compilation, processing of the source
segment will continue if possible. The digrams segment in case
of an error will be left in an undefined state.


During compilation, the ALM assembler is used. At that
point the letters "ALM" will be printed on the terminal. If
compilation was successful, no other messages should appear.


The listing produced by digram_table_compiler is in a format
suitable for printing on the terminal -- not for dprinting. This
is because blank lines are used for page breaks, instead of the
"new page" character as recognized by dprint.


Syntax


The syntax of the source segment is specified below. Spaces
are meaningful to this compiler and a space is only allowed where
specified as <space>. The new line character is indicated as
<new line>.


<digram table>::= <unit specs>;[<new line>]...<digram specs>$
<unit specs>::= <unit spec>[<delim><unit spec>]...
<digram specs>::= <digram spec>[<delim><digram spec>]...
<delim>::= ,[<new line>]|<new line>
<unit spec>::= <unit name>[<not begin word>[<no final split>]]
<digram spec>::= [<begin><not begin><break><prefix>]
     <unit name><unit name>[<suffix>[<end>[<not end>]]]
<unit name>::= <letter>[<letter>]
<letter>::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
<not begin word>::= <bit>
<no final split>::= <bit>
<begin>::= <bit>
<not begin>::= <bit>
<break>::= <bit>
<prefix>::= <space>|-
<suffix>::= <space>|-|+
<end>::= <bit>
<not end>::= <bit>
<bit>::= <space>|1


The first part of the <digram table> consists of definitions
of the various units that are to be used and their attributes.
The units are defined as one or two-letter pairs, and the order
in which they are defined is unimportant. For each unit, the
attributes <not begin word> and <no final split> may be
specified. In addition, if <unit name> is a, e, i, o, or u, the
"vowel" attribute is set. If the unit is y, the
"alternate vowel" attribute is set. A <bit> is assumed to be
zero if specified as <space>, or one if specified as 1.


The second part of <digram table> specifies all possible
pairs of units and the attributes for each pair. The order in
which these pairs must be specified depends on the order of the
<unit specs> as follows:

Number the <unit spec>s from 1 to n in the order in which
they appeared in <unit specs>. The first <digram spec> must
consist of the pair of units numbered (1,1), the second
<digram spec> is the pair (1,2), etc., and the last <digram spec>
is the pair (n,n). All pairs must be specified, i.e., there must
be n*n <digram spec>s. The <bit>s preceding or following each
pair set the attributes for that pair as shown. The <prefix> and
<suffix> indicators are set to 1 if specified as "-". If
<suffix> is specified as "+", the "illegal pair" indicator will
be set, and no other attributes may be specified for that
<digram spec>.
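
The required ordering of the pairs is row-major over the numbered
units, which a few lines of Python make concrete. The function
names are hypothetical, and the four units a, b, sh, e are used
purely for illustration.

```python
from itertools import product


def digram_order(units):
    """List every pair of units in the order the source must give them."""
    return [first + second for first, second in product(units, repeat=2)]


def digram_position(i, j, n):
    """1-based position of the pair (i, j) among the n*n digram specs."""
    return (i - 1) * n + j


units = ["a", "b", "sh", "e"]
pairs = digram_order(units)
print(pairs)
print(digram_position(3, 2, len(units)))  # where the pair (sh, b) belongs
```

A compiler or checking tool could use the same position formula to
report which <digram spec> is missing or out of place.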


The following is a very short example of a <digram table>.
Only four units are defined, "a", "b", "sh" and "e". The letter
"e" is given the "no final split" attribute, the pair "aa" is
given "illegal pair", the pair "ae" is given the "not begin",
"break", and "not end" attributes, etc.


     a,b,sh,e 1;
     aa+,ab,ash, 11 ae  1
     ba, 1  bb, 11 bsh  1,be
     sha, 11 shb  1,shsh+,she,ea,eb,esh,ee
     $


Assume the above segment was named "dt.dtc". Below is an
example of the command used to compile and list the table
produced for dt.


     digram_table_compiler dt -ls
     ALM

     1 a  0010     2 b  0000     3 sh 0000     4 e  0110


     000 aa  +00   000 ba   00   000 sha  00   000 ea   00
     000 ab   00   010 bb   00   011 shb  01   000 eb   00
     000 ash  00   011 bsh  01   000 shsh+00   000 esh  00
     011 ae   01   000 be   00   000 she  00   000 ee   00


The first line of output lists the individual units. The
number preceding the unit is the unit index. The four bits
following the unit are respectively:


     not begin syllable
     no final split
     vowel
     alternate vowel


Following the unit specifications are the digram specifications.
Preceding each digram are three bits and a space (or possibly a
"-") with meanings corresponding to those specified in the source
segment as follows:


     begin
     not begin
     break
     prefix (if "-" appears)


Immediately following each digram is a field which may be blank,
"-", or "+". If "+", the "illegal pair" flag is set. Otherwise,
the meaning of the "-" and following two bits are as follows:


     suffix (if "-" appears)
     end
     not end



Name: hyphen_test


This command uses the random word generator (the same one
used by generate_words) to divide words into syllables. Words
are printed on the terminal with hyphens between the syllables.


Usage

     hyphen_test -control_arg- -word1- ... -wordn-


1) control_arg   may be -probability (-pb), specifying that
                 the probability of each of the words that
                 follows be printed alongside the hyphenated
                 word.


2) wordi         are one or more words to be hyphenated. A
                 word may consist of three to twenty
                 alphabetic characters, only the first of
                 which may be uppercase.


Notes


The control argument may appear anywhere in the command
line. However, it only applies to words that follow. Words
preceding the option will be hyphenated but no probabilities will
be calculated.


If a word contains any illegal characters, or is not of
three to twenty characters in length, the word will be printed
unhyphenated, followed by **.


If the word could not be completely hyphenated because it
was considered unpronounceable, an asterisk (*) will be printed
out in front of the first character that was not accepted. The
part of the word before the asterisk will be properly hyphenated.
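
The three output forms can be sketched in Python. The function is
hypothetical and takes a word, a list of hyphen flags (one per
character, true where a hyphen follows), and a status code with
the sign convention documented for hyphenate_: zero for success,
negative for an invalid word, positive for the position of the
first character that was not accepted. It assumes no hyphen is
ever written after the final character.

```python
def format_hyphenated(word, hyphens, code):
    """Render a word the way the notes above describe the printed output."""
    if code < 0:
        return word + "**"  # illegal characters or bad length
    pieces = []
    for position, letter in enumerate(word, start=1):
        if code > 0 and position == code:
            pieces.append("*")  # marks the first character not accepted
        pieces.append(letter)
        if hyphens[position - 1] and position < len(word):
            pieces.append("-")
    return "".join(pieces)


print(format_hyphenated("random", [False, False, True, False, False, False], 0))
print(format_hyphenated("abxq", [False, False, False, False], 3))
print(format_hyphenated("r2d2", [False] * 4, -1))
```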


The calculated probability is the probability that the word
would have been generated by generate_words, assuming
generate_words was requested to generate a word of that length
only. If a range of lengths is requested of generate_words, each
length has equal probability. For example, if generate_words is
called to generate words of 6, 7, or 8 characters, there is a 33%
probability that a given word will have 8 characters. If
hyphen_test is then asked to calculate the probability of a given
8 letter word, that probability should be divided by 3 to obtain
the correct probability for the case of three possible lengths.


MTB-194                    Honeywell Information Systems, Inc.



Name: hyphenate_


This subroutine attempts to hyphenate a word into syllables.


Usage


     dcl hyphenate_ entry (char(*), (*) bit(1) aligned, fixed
          bin);


     call hyphenate_ (word, hyphens, code);


1) word      This is a left justified ASCII string, 3 to 20
             characters in length. This string must contain
             all lowercase alphabetic characters, except the
             first character may be uppercase. Trailing blanks
             are not permitted in this string. (Input)

2) hyphens   This array will contain a "1"b for every character
             in the word that is to have a hyphen following it.
             (Output)

3) code      This is a status code, as follows:

              0   word has been successfully hyphenated.
             -1   word contains illegal (non-alphabetic or
                  uppercase) characters.
             -2   word was not from three to twenty characters
                  in length.

             Any positive value of code means that the word
             couldn't be completely hyphenated. In this case,
             code is the position of the first character in
             word that was not acceptable. The part of the
             word before code will be properly hyphenated.
             (Output)


Notes


This subroutine uses random_word_ to provide the
hyphenation. It does this by calling random_word_$give_up and
supplying its own version of random_unit and random_vowel that
return specified units (of the particular word to be hyphenated)
instead of random units.


The word supplied to hyphenate_ is first transformed into
units by translating pairs of letters into single units if a
2-letter unit is defined for the pairs and then by translating
the remaining single letters into units. See the subroutine
description of random_word_ and random_unit_ for a description of
units. If any units of the word are rejected by random_word_,
hyphenate_ tries to determine if the refused letter was a
2-letter unit. If this is the case, the 2-letter unit is broken
into two 1-letter units and random_word_ is called again. In
rare cases, hyphenate_ is not able to determine which 2-letter
unit is at fault, and will return a status code indicating that
the word is unpronounceable, when, in fact, it could have been
properly divided by breaking up a 2-letter unit.
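
The translation of a word into units, and the fallback of
splitting a 2-letter unit, can be sketched in Python. The
left-to-right greedy scan is an assumption (the text above does
not state the scanning order), the function names are
hypothetical, and the set of 2-letter units is the one named
elsewhere in this document.

```python
# Two-letter units named elsewhere in this document.
TWO_LETTER_UNITS = {"ch", "gh", "ph", "rh", "sh", "th", "wh", "qu", "ck"}


def to_units(word):
    """Greedy left-to-right split of a word into 1- and 2-letter units."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in TWO_LETTER_UNITS:
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def split_unit(units, index):
    """Break the 2-letter unit at index into two 1-letter units for a retry."""
    return units[:index] + list(units[index]) + units[index + 1:]


print(to_units("fish"))
print(split_unit(to_units("fish"), 2))
```

After a split, the whole unit list would be offered to the
generator again from the first position.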


Entry: hyphenate_$probability


This entry returns information as above, but also supplies
the probability of the word having been generated at random by
generate_word_ or random_word_generator_. The assumption is made
that generate_word_ or random_word_generator_ was asked to supply
a word of exactly the same length as the word given to
hyphenate_, rather than a range of lengths. If a range of
lengths was asked of generate_word_, the probability must be
divided by the number of different lengths (all lengths are
equally probable).


Usage


     dcl hyphenate_$probability entry (char(*), (*) bit(1)
          aligned, fixed bin, float bin);


     call hyphenate_$probability (word, hyphens, code,
          probability);


1) to 3) are as above.


4) probability   is the probability as defined above. (Output)


Notes


If the supplied word is illegal (i.e., code is not zero), the
probability will be returned as zero.


Entry: hyphenate_$debug_on, hyphenate_$debug_off


These entries set and reset a switch that causes
hyphenate_$probability to print, on user_output, all units (see
the subroutine descriptions of random_word_ and random_unit_ for
a description of units) that are illegal in a given position of
the word. This entry is useful for debugging a digram table for
random_word_. It makes no assumptions about the information
contained in the digram table with regards to which units are
defined, their distributions, the order of the units, etc.
However, it assumes that a call to random_unit_$probabilities will
return arrays of the size digrams_$n_units containing the
probabilities of the units that are defined. See the subroutine
description of random_unit_ for a description of the
random_unit_$probabilities entry, and the subroutine description
of random_word_ for a description of digrams_.


Usage


     dcl hyphenate_$debug_on entry;
     dcl hyphenate_$debug_off entry;


     call hyphenate_$debug_on;
     call hyphenate_$debug_off;


Notes


An example of the output produced is as follows. The
assumption is that hyphenate_$probability is invoked by the
hyphen_test command using the -probability option.


     hyphenate_$debug_on

     hyphen_test -probability fish

KeCkKe ls DaeCodafadeNaleoksMaNsDeSataVoWeXsVaZaCNegNaDhy 
rheshsthewh,queckeds Isrhswhsequeshs 

fish 6.04127576e-5


In the above example, the units x and ck are shown to have been
illegal as the first unit of the word, and the unit f
(underlined) is the first unit of the word that was accepted.
All other units that were not printed are legal as the first unit
of the word. Following the semicolon after f are the units that
are illegal in the second position of the word (assuming that f
is the first unit). Then i is shown as the legal unit that is
taken from the word "fish". This repeats for each position of
the word, ending in the legal unit sh (note only one underline).


If the supplied word is illegal, the last underlined letter
in the output is (usually) the letter that was not accepted. In
cases where hyphenate_ has to split up a 2-letter unit, the word
will be shown to start over from the beginning.



Name: print_digram_table


This entry merely prints the digram table on the terminal,
assuming that it has already been compiled successfully. The
segment "digrams_" is assumed to be located in the working
directory.


Usage

     print_digram_table -n-


1) n   is the number of columns in which to print the table.
       If not specified, the maximum number of columns that
       will fit in the terminal line will be used. Each
       column occupies 14 positions. If file_output is being
       used, the terminal line width is assumed to be 132.


Notes


This entry performs the same function as the -list option of
digram_table_compiler.



Name: random_unit_


This subroutine provides a random unit number for
random_word_ based on a standard distribution of a given set of
units. It is referenced by the generate_word_ subroutine as an
entry value that is passed in the call to random_word_. This
subroutine assumes that the digram table being used by
random_word_ is a standard table. The digram table itself is not
referenced by this subroutine.


Usage


     declare random_unit_ entry (fixed bin);


     call random_unit_ (unit);


1) unit   is a number from 1 to 34 that corresponds to a
          particular unit as listed in Notes below. (Output)


Notes


The table below contains the units that are assumed
specified in the digrams supplied to random_word_. Shown in the
table are the unit number, the letter or letters that unit
represents, and the probability of that unit number being
generated.


  1 a  .04739    8 h  .02844   15 o  .04739   22 w  .03792   29 rh .00474
  3 c  .05687   10 j  .03792   17 r  .04739   24 y  .03792   31 th .00948
  4 d  .05687   11 k  .03792   18 s  .03792   25 z  .00474   32 wh .00474
  5 e  .05687   12 l  .02844   19 t  .04739   26 ch .00474   33 qu .00474
  6 f  .03792   13 m  .02844   20 u  .02844   27 gh .00474   34 ck .00474
  7 g  .03792   14 n  .04739   21 v  .03792   28 ph .00474



Entry: random_unit_$random_vowel


This entry returns a vowel unit number only.


Usage


     declare random_unit_$random_vowel entry (fixed bin);


     call random_unit_$random_vowel (unit);


1) unit   As above. (Output)


Below are listed the vowel units and their distributions.


      1 a  .167
      5 e  .250
      9 i  .167
     15 o  .167
     20 u  .167

Entry: random_unit_$probabilities


This entry returns arrays containing the probabilities of
the units as listed in the table above. This
entry is provided for hyphenate_$probability and any other
program that might require this information. The probabilities
must be computed when this entry is called, so it is suggested
that the call be made only once per process and the values saved
in internal static storage.


Usage


     declare random_unit_$probabilities entry ((*) float bin, (*)
          float bin);


     call random_unit_$probabilities (unit_probs, vowel_probs);


1) unit_probs    This array contains the probabilities of the
                 individual units assuming the random_unit_ entry
                 is called to generate the random units. The value
                 of unit_probs(i) is the probability of unit(i).
                 (Output)

2) vowel_probs   This array contains the probabilities of the units
                 when random_vowel is called. Since there are only
                 6 vowels, most of these values will be zero.
                 (Output)


Notes


A future version of random_unit_ may use different units
with different probabilities. The size of the two arrays must be
large enough to hold the maximum number of values that may be
returned by random_unit_ (which is currently 34). Programs
should not depend on the unit_index-to-letter correspondence as
shown in the table. This information can be obtained by using
the include file digram_structure.incl.pl1.


Name: random_word_


This routine returns a single random pronounceable word of
specified length. It is called by generate_word_, and allows the
caller to specify the particular subroutines to be used to
generate random units. For users desiring random words with an
English-like distribution of letters, generate_word_ should be
used.


Usage


     dcl random_word_ entry ((0:*) fixed, (0:*) bit(1) aligned,
          fixed, fixed, entry, entry);


     call random_word_ (word, hyphens, char_length, unit_length,
          random_unit, random_vowel);


1) word           The random word will be stored in this array
                  starting at word(1) (word(0) will always be 0).
                  The numbers stored will correspond to a "unit
                  index" as described in Notes below. This array
                  must have a length at least equal to the value of
                  "char_length". Unused positions in this array, up
                  to word(char_length), will be set to zero.
                  (Output)

2) hyphens        This array must be of length at least
                  "char_length". A bit on in a position of this
                  array indicates that the corresponding unit in
                  "word" (including the very last unit) is the last
                  unit of a syllable. (Output)

3) char_length    Length of the word to be generated, in characters.
                  (Input)

4) unit_length    This is the length of the generated random word in
                  units, i.e., the index of the last non-zero entry
                  in the "word" array. The actual length of the
                  word in equivalent characters will be the value of
                  char_length. (Output)

5) random_unit    This is the routine that will be called by
                  random_word_ each time a random unit is needed.
                  The random_unit routine is declared as follows:

                       dcl random_unit entry (fixed bin);

                  where the value returned is a unit index between 1
                  and n_units. If an English-like distribution of
                  letters is desired, the "random_unit_" subroutine
                  may be specified here. See Notes below. (Input)

6) random_vowel   This is the routine called by random_word_ when a
                  vowel unit is required. This routine must return
                  the index of a unit whose "vowel" or
                  "alternate_vowel" bits are one. See Notes below.
                  This routine is declared as follows:

                       dcl random_vowel entry (fixed bin);

                  If desired, the subroutine
                  "random_unit_$random_vowel" may be specified in
                  this place. (Input)


Notes


The word array can be converted into characters by calling
convert_word_.


In order to use random_word_, a digram table, contained in a
segment named "digrams_", must be available in the search path.
This table can be created by the digram_table_compiler.


If the user supplies his own versions of random_unit and
random_vowel, these subroutines will have to supply legal units
that are recognized by the random_word_ subroutine. The include
file "digram_structure.incl.pl1" can be used to reference the
digram table to determine which units are available. If included
in the source program, appropriate references to the following
variables of interest in "digrams_" will be generated:


     dcl n_units fixed bin defined digrams_$n_units;
     dcl letters(0:n_units) char(2) aligned
          based(addr(digrams_$letters));
     dcl 1 rules(n_units) aligned based(addr(digrams_$rules)),
          2 vowel bit(1),
          2 alternate_vowel bit(1),


where

n_units      is the number of different units.

letters(i)   contains 1 or 2 characters (left justified)
             for the i'th unit.

rules.vowel(i), rules.alternate_vowel(i)
             One of these two bits is set for the units
             that may be returned by a call to
             random_vowel.


When random_unit is called, a number from 1 to n_units must
be returned. When random_vowel is called, a number from 1 to
n_units, where one of the two bits in rules(i) is marked, must be
returned.


Entry! random_word_Sdebug_on 

This entry sets a switch in random_word_ that causes
printing (on user_output) of partial words that could not be
completed. This entry is of interest during debugging of
random_word_ or for checking the consistency of the digram table
prepared by the user.


Usage


dcl random_word_$debug_on entry;
call random_word_$debug_on;


Entry: random_word_$debug_off


This entry resets the switch set by debug_on. 


Additional notes


The random_word_ subroutine can be used for certain special
applications (such as the application used by hyphenate_), and
there are certain features that help support some of these
applications. The features described below are of little
interest to most users.


The first feature allows the caller-supplied random_unit
(and random_vowel) subroutine to find out whether random_word_
"accepted" or "rejected" the previous unit supplied by
random_unit. Each time random_unit is invoked by random_word_,
the value of the argument passed is the index of the previous
unit that random_unit returned (or zero on the first call to
random_unit in a given invocation of random_word_). The sign of
the argument will be positive if this last unit was accepted.
"Accepted" means that the last unit was inserted into the random
word and the word index maintained by random_word_ was
incremented. Once a unit is accepted, it is never removed. Thus
a positive value of the unit index passed to random_unit means
that a unit for the next position of the word is requested.


If the unit index passed to random_unit has a negative sign,
the last unit was rejected according to the rules used by
random_word_ and information supplied in the digram table. If
the unit is rejected, random_word_ does not advance its word
index and calls random_unit again for another unit for that same
word position. With this information random_unit can keep track
of the "progress" of the word being generated.
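
The accept/reject signalling described above can be sketched as
follows. This is a simplified Python model, not the PL/I
implementation: the names random_word, no_repeat and
TrackingUnitSource are hypothetical, word length is counted in
units rather than letters, and the legality rule stands in for the
digram table.

```python
def no_repeat(word, unit):
    """Stand-in legality rule: a unit may not follow itself."""
    return not word or word[-1] != unit


def random_word(random_unit, is_legal, length):
    """Build a word of `length` units, reporting each verdict back."""
    word = []
    feedback = 0  # zero on the first call of this invocation
    while len(word) < length:
        unit = random_unit(feedback)
        if is_legal(word, unit):
            word.append(unit)   # accepted: the word index advances
            feedback = unit     # positive sign reports acceptance
        else:
            feedback = -unit    # negative sign reports rejection
    return word


class TrackingUnitSource:
    """A caller-supplied random_unit that records each verdict."""

    def __init__(self, candidates):
        self.candidates = iter(candidates)
        self.accepted = []
        self.rejected = []

    def __call__(self, previous):
        if previous > 0:
            self.accepted.append(previous)
        elif previous < 0:
            self.rejected.append(-previous)
        return next(self.candidates)
```

In this model the verdict on a unit only reaches the source on the
following call, so the source never sees the verdict on the unit
that completes the word.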


The feature described above is used by the special
random_unit routine provided by hyphenate_. Since the
random_unit routine for hyphenate_ is not really supplying random
units (but is supplying units of the word to be hyphenated), it
must know whether any particular unit is rejected by
random_word_. Rejection then implies that the word is illegal
according to random_word_ rules.


The second feature allows random_unit to "try" a certain
unit without committing that unit to actually be used in the
random word. The sign of each unit supplied to random_word_ by
random_unit is checked. If the sign of the unit is positive,
random_word_ will accept or reject the unit according to its
rules, and will indicate this on the subsequent call to
random_unit.


If the sign of the unit passed to random_word_ is negative,
random_word_ will merely indicate (on the subsequent call to
random_unit) whether that unit would have been accepted, but it
never actually updates the word index. In other words,
random_word_ always rejects the unit but lets random_unit know
whether the unit was acceptable.


This latter feature is used by hyphenate_$probability in
order to determine which of all possible units are acceptable in
a given position of the word. The random_unit routine used by
hyphenate_$probability tries all possible units in each word
position, and only allows random_word_ to accept the unit that
actually appears in that position.
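
The trial mechanism and the probing strategy described above can
be sketched as follows. This is a Python model under stated
assumptions, not the hyphenate_ code: the names
random_word_with_trials, no_repeat and ProbingUnitSource are
hypothetical, word length is counted in units rather than letters,
and the legality rule stands in for the digram table.

```python
def no_repeat(word, unit):
    """Stand-in legality rule: a unit may not follow itself."""
    return not word or word[-1] != unit


def random_word_with_trials(random_unit, is_legal, length):
    """Like random_word_, but a negative unit is only a trial."""
    word = []
    feedback = 0
    while len(word) < length:
        unit = random_unit(feedback)
        index = abs(unit)
        acceptable = is_legal(word, index)
        if unit > 0 and acceptable:
            word.append(index)  # only positive units are committed
        feedback = index if acceptable else -index
    return word


class ProbingUnitSource:
    """Tries every unit at each position, then commits the real one."""

    def __init__(self, actual_units, n_units):
        self.actual = list(actual_units)
        self.n_units = n_units
        self.position = 0
        self.next_probe = 1
        self.awaiting_verdict = False
        self.just_committed = False
        self.acceptable = [[] for _ in self.actual]

    def __call__(self, previous):
        if self.just_committed and previous < 0:
            raise ValueError("committed unit was rejected: illegal word")
        self.just_committed = False
        if self.awaiting_verdict and previous > 0:
            self.acceptable[self.position].append(previous)
        if self.next_probe <= self.n_units:
            unit = -self.next_probe  # negative sign marks a trial
            self.next_probe += 1
            self.awaiting_verdict = True
            return unit
        unit = self.actual[self.position]  # commit the real unit
        self.position += 1
        self.next_probe = 1
        self.awaiting_verdict = False
        self.just_committed = True
        return unit
```

After a run, the acceptable list holds, for each word position, the
units that would have been accepted there. A caller could use those
counts to estimate how likely the word is under the rules, which is
the kind of information hyphenate_$probability is described as
gathering.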
Name: read_table_


This subroutine is the compiler for the digram table for
random_word_. It is called by digram_table_compiler.


Usage


declare read_table_ entry (ptr, fixed bin(24)) returns
(bit(1));


flag = read_table_ (source_ptr, bitcount);


1) source_ptr is a pointer to the source segment to be compiled.
(Input)


2) bitcount is the bit count of the source segment. (Input)


3) flag is "0"b if compilation was successful. It is "1"b
if an error was encountered.


Notes


If compilation was successful, the compiled table will be
placed in the working directory with the name "digrams_". If
unsuccessful, the digrams_ segment may or may not have been
created, and may be left in an inconsistent state (i.e., unusable
by random_word_). Error messages are printed out on user_output
as the errors are encountered, except that file system errors are
printed on error_output.


This subroutine uses the ALM assembler for part of its work.
As a result, the letters "ALM" will be printed on user_output
sometime during the compilation.




MTB-194 Honeywell Information Systems, Inc.


